Incremental kernel spectral clustering for online learning of non-stationary data

Authors

  • Rocco Langone
  • Oscar Mauricio Agudelo
  • Bart De Moor
  • Johan A. K. Suykens
Abstract

In this work, a new model for online clustering named Incremental Kernel Spectral Clustering (IKSC) is presented. It is based on Kernel Spectral Clustering (KSC), a model designed in the Least Squares Support Vector Machines (LS-SVMs) framework, with a primal-dual setting. The IKSC model is developed to adapt quickly to a changing environment, in order to learn evolving clusters with high accuracy. In contrast with other existing incremental spectral clustering approaches, the eigen-updating is performed in a model-based manner, by exploiting one of the Karush-Kuhn-Tucker (KKT) optimality conditions of the KSC problem. We test the capabilities of IKSC with experiments conducted on computer-generated data and on a real-world data-set of PM10 concentrations registered during a pollution episode that occurred in Northern Europe in January 2010. We observe that our model is able to precisely recognize the dynamics of shifting patterns in a non-stationary context.
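
For orientation, the standard KSC formulation from the LS-SVM literature can be sketched as follows. This sketch is not taken from the abstract, which does not state which KKT condition IKSC exploits. The notation is assumed: $N$ training points, $k$ clusters, feature matrix $\Phi$, kernel matrix $\Omega = \Phi\Phi^T$, degree matrix $D$, regularization constants $\gamma_l$, score variables $e^{(l)}$ and dual variables $\alpha^{(l)}$.

```latex
\begin{aligned}
&\min_{w^{(l)},\, e^{(l)},\, b_l}\;
  \frac{1}{2}\sum_{l=1}^{k-1} w^{(l)T} w^{(l)}
  - \frac{1}{2N}\sum_{l=1}^{k-1} \gamma_l\, e^{(l)T} D^{-1} e^{(l)}
  \quad \text{s.t.}\quad e^{(l)} = \Phi w^{(l)} + b_l 1_N, \\[4pt]
&\text{KKT conditions:}\quad
  w^{(l)} = \Phi^T \alpha^{(l)}, \qquad
  \alpha^{(l)} = \frac{\gamma_l}{N} D^{-1} e^{(l)}, \qquad
  1_N^T \alpha^{(l)} = 0, \qquad
  e^{(l)} = \Phi w^{(l)} + b_l 1_N, \\[4pt]
&\text{dual problem:}\quad
  D^{-1} M_D \Omega\, \alpha^{(l)} = \lambda_l\, \alpha^{(l)}, \qquad
  M_D = I_N - \frac{1_N 1_N^T D^{-1}}{1_N^T D^{-1} 1_N}, \qquad
  \lambda_l = \frac{N}{\gamma_l}.
\end{aligned}
```

A condition of the form $\alpha^{(l)} = (\gamma_l / N)\, D^{-1} e^{(l)}$ ties each eigenvector entry directly to the score variable of the corresponding point. A relation of this kind could, in principle, allow eigenvector entries for new points to be obtained from their projections without re-solving the eigenvalue problem. Whether this is the exact mechanism used by IKSC is an assumption here.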


Similar resources

Clustering evolving data using kernel-based methods

Thanks to recent developments in information technologies, there is a profusion of available data in a wide range of application domains, from science and engineering to biology and business. For this reason, the demand for real-time data processing, mining and analysis has grown explosively in recent years. Since labels are usually not available and in general a full under...


Semi-supervised composite kernel learning using distance metric learning techniques

The distance metric plays a key role in many machine learning and computer vision algorithms, so choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...


Some New Methods for Prediction of Time Series by Wavelets

Extended Abstract. Forecasting is one of the most important purposes of time series analysis. For many years, classical methods were used for this aim. But these methods do not perform well on real time series because of the non-linearity and non-stationarity of these data sets. On one hand, most real-world time series data display a time-varying second order structure. On th...


Composite Kernel Optimization in Semi-Supervised Metric

Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...


Online Spectral Clustering on Network Streams

A graph is an extremely useful representation of a wide variety of practical systems in data analysis. Recently, with the fast accumulation of stream data from various types of networks, significant research interest has arisen in spectral clustering for network streams (or evolving networks). Compared with the general spectral clustering problem, the data analysis of this new type of problems m...



Journal:
  • Neurocomputing

Volume 139

Publication date: 2014